Training Large Language Models to Reason in a Continuous Latent Space